 dynamic fusion


Supplementary Material for "Hierarchical Adaptive Value Estimation for Multi-modal Visual Reinforcement Learning"

Neural Information Processing Systems

Section C describes the details of the experimental setup, including network architectures, hyperparameters, and hardware details. This outcome emphasizes the necessity of feature interaction or feature fusion to tackle intricate situations. Furthermore, an amalgamation of feature fusion and value fusion can offer better performance. This adjustment allows us to evaluate the robustness and adaptability of our approach in handling a larger number of vehicles in the environment. As we increase the number of vehicles on the road, Fig. A2 (a) clearly indicates that HAVE consistently delivers the highest performance. The training and testing curves of HAVE and other comparable methods are given in A4.


Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains

Neural Information Processing Systems

We propose to jointly analyze experts' eye movements and verbal narrations to discover important and interpretable knowledge patterns to better understand their decision-making processes. The discovered patterns can further enhance data-driven statistical models by fusing experts' domain knowledge to support complex human-machine collaborative decision-making. Our key contribution is a novel dynamic Bayesian nonparametric model that assigns latent knowledge patterns into key phases involved in complex decision-making. Each phase is characterized by a unique distribution of word topics discovered from verbal narrations and their dynamic interactions with eye movement patterns, indicating experts' special perceptual behavior within a given decision-making stage. A new split-merge-switch sampler is developed to efficiently explore the posterior state space with an improved mixing rate. Case studies on diagnostic error prediction and disease morphology categorization help demonstrate the effectiveness of the proposed model and discovered knowledge patterns.


Supplementary Material for "Hierarchical Adaptive Value Estimation for Multi-modal Visual Reinforcement Learning" Yangru Huang

Neural Information Processing Systems

The contents of this supplementary material are organized as follows: Section A provides additional experimental results, including more results with three modalities, performance under dynamic weathers, performance under several challenging or extreme environmental conditions (e.g., increased number of vehicles and dazzling sunlight), results on DeepMind Control Suite, and an ablation study of the auxiliary losses and the design of re-fusion. Section B provides further discussions related to our approach. This includes a comparison between value-level dynamic fusion and feature-level dynamic fusion supported by empirical results, the advantages of hierarchical bi-level fusion over uni-level fusion, and the relationship and differences between our approach and the value decomposition techniques in multi-agent RL. Section C describes the details of the experimental setup, including network architectures, hyper-parameters, and hardware details. Section D states the potential negative societal impacts of our work.


Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains

Neural Information Processing Systems

Weaknesses: I would have liked to have seen more examples in the discussion of the topics that were detected. It would be helpful if, in Table 1 and other similar illustrations, the different topics that the colored words correspond to were explicitly indicated. In the supplementary material, the table showing topics (Table 4) is useful, but I am curious to understand more about the links between the works in each topic category. Regarding baselines, I realize that in multimodal problems, especially those using modalities that are frequently not employed (e.g., eye tracking), it is difficult to find state-of-the-art models that are appropriate. So this is not a major criticism, but it does feel that the justification of the chosen baselines could be expanded.


Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains

Neural Information Processing Systems

This paper has a lot of content: an interesting cognitive science question of modelling human decision-making, data fusion of texts and eye movements modelled with a new dynamic Bayesian nonparametric model, and a new sampler for the model. The paper received a special amount of attention: 5 reviews, which were needed because the paper makes several different kinds of contributions. Hence it is not a stereotypical good conference paper having one neat idea and presenting convincing theoretical or empirical support for it. Reviewers discussed the paper intensively, concluding that the paper is likely to be interesting at NeurIPS, and since there is no easy fix to make it more suitable to the format, such as dividing it into two papers, it is good enough to be accepted though not among the best papers. Clarity can easily be improved by the authors, and additional details added in both the paper and the supplement.


Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation

Feng, Ruoxuan, Hu, Di, Ma, Wenke, Li, Xuelong

arXiv.org Artificial Intelligence

Humans possess a remarkable talent for flexibly alternating between different senses when interacting with the environment. Picture a chef skillfully gauging the timing of ingredient additions and controlling the heat according to the colors, sounds, and aromas, seamlessly navigating through every stage of the complex cooking process. This ability is founded upon a thorough comprehension of task stages, as achieving the sub-goal within each stage can necessitate the utilization of different senses. In order to endow robots with a similar ability, we incorporate the task stages divided by sub-goals into the imitation learning process to guide dynamic multi-sensory fusion accordingly. We propose MS-Bot, a stage-guided dynamic multi-sensory fusion method with coarse-to-fine stage understanding, which dynamically adjusts the priority of modalities based on the fine-grained state within the predicted current stage. We train a robot system equipped with visual, auditory, and tactile sensors to accomplish challenging robotic manipulation tasks: pouring and peg insertion with keyway. Experimental results indicate that our approach enables more effective and explainable dynamic fusion, aligning more closely with the human fusion process than existing methods.


Not All Frequencies Are Created Equal: Towards a Dynamic Fusion of Frequencies in Time-Series Forecasting

Zhang, Xingyu, Zhao, Siyu, Song, Zeen, Guo, Huijie, Zhang, Jianqi, Zheng, Changwen, Qiang, Wenwen

arXiv.org Artificial Intelligence

Long-term time series forecasting is a long-standing challenge in various applications. A central issue in time series forecasting is that methods should expressively capture long-term dependency. Furthermore, time series forecasting methods should be flexible when applied to different scenarios. Although Fourier analysis offers an alternative to effectively capture reusable and periodic patterns to achieve long-term forecasting in different scenarios, existing methods often assume high-frequency components represent noise and should be discarded in time series forecasting. However, we conduct a series of motivation experiments and discover that the role of certain frequencies varies depending on the scenario. In some scenarios, removing high-frequency components from the original time series can improve the forecasting performance, while in other scenarios, removing them is harmful to forecasting performance. Therefore, it is necessary to treat the frequencies differently according to specific scenarios. To achieve this, we first reformulate the time series forecasting problem as learning a transfer function of each frequency in the Fourier domain. Further, we design Frequency Dynamic Fusion (FreDF), which individually predicts each Fourier component and dynamically fuses the output of different frequencies. Moreover, we provide a novel insight into the generalization ability of time series forecasting and propose the generalization bound of time series forecasting. Then we prove FreDF has a lower bound, indicating that FreDF has better generalization ability. Extensive experiments conducted on multiple benchmark datasets and ablation studies demonstrate the effectiveness of FreDF.
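The idea of a per-frequency transfer function followed by a weighted fusion of frequency components can be illustrated with a minimal sketch. This is not the authors' FreDF implementation: the function names (`dft`, `component_signal`, `fuse_frequencies`) are hypothetical, the transfer coefficients and fusion weights are passed in by hand rather than learned, and a naive DFT is used for self-containment. In a FreDF-style model, both quantities would presumably be produced by trained modules.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def component_signal(coeff, k, n):
    """Time-domain signal contributed by one Fourier coefficient at index k."""
    return [(coeff * cmath.exp(2j * math.pi * k * t / n)).real / n for t in range(n)]


def fuse_frequencies(x, transfer, weights):
    """Scale each frequency by its transfer coefficient, then fuse by weight.

    transfer[k] is a (hypothetical) complex transfer coefficient for frequency k.
    weights[k] is a (hypothetical) fusion weight for frequency k; in a learned
    model it could depend on the input, here it is supplied by the caller.
    """
    n = len(x)
    spectrum = dft(x)
    fused = [0.0] * n
    for k in range(n):
        component = component_signal(transfer[k] * spectrum[k], k, n)
        for t in range(n):
            fused[t] += weights[k] * component[t]
    return fused


# Usage: a slow cosine plus a weaker fast cosine on a window of 8 samples.
n = 8
x = [math.cos(2 * math.pi * t / n) + 0.5 * math.cos(2 * math.pi * 3 * t / n)
     for t in range(n)]

# A pure phase shift per frequency advances the (periodic) window by one step.
shift_by_one = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
forecast = fuse_frequencies(x, shift_by_one, [1.0] * n)

# Fusion weights that keep only the lowest non-zero frequency (and its mirror).
low_only = [1.0 if k in (1, n - 1) else 0.0 for k in range(n)]
smoothed = fuse_frequencies(x, [1.0] * n, low_only)
```

With all weights equal to one, the fusion reduces to an ordinary inverse transform; setting some weights to zero reproduces the "discard high frequencies" assumption that the abstract argues should not be applied uniformly across scenarios.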


A Novel Adaptive Kernel for the RBF Neural Networks

Khan, Shujaat, Naseem, Imran, Togneri, Roberto, Bennamoun, Mohammed

arXiv.org Machine Learning

Abstract: In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks. [...] The convergence of the ACA is analyzed by the Lyapunov criterion. [...]

I. INTRODUCTION: The RBF neural networks have shown excellent performance in a number of problems of practical interest. [...] reservoirs of brine are analyzed for physicochemical properties [...] [3] the RBF kernel is used to predict the pressure gradient [...] In the context of nuclear physics, RBF has been effectively used to model the stopping power data of materials as in [4]. [...] In [12] a novel RBF network with the multi-kernel is proposed to obtain an optimized and [...] The unknown centres of the multikernels are determined by an improved k-means clustering [...] An orthogonal least squares (OLS) algorithm is used to determine the remaining parameters. [...] Cognitive Radial Basis Function network (McRBFN) and its Projection based Learning (PBL) referred to as PBL-McRBFN [...]